How Accurately Can We Predict the Melting Points of Drug-like Compounds?

Authors

  • Igor V. Tetko
  • Iurii Sushko
  • Sergii Novotarskyi
  • Luc Patiny
  • Ivan Kondratov
  • Alexander E. Petrenko
  • Larisa Charochkina
  • Abdullah M. Asiri
Abstract

This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47,000 compounds. The distributions of MPs in drug-like and drug lead sets showed that more than 90% of molecules melt within the interval [50, 250] °C. The final model achieved an RMSE of less than 33 °C for molecules in this temperature interval, which is the most important one for medicinal chemistry users. This performance was achieved using a consensus model that was significantly more accurate than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. The three distance-to-model measures that we analyzed did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
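
The two quantities the abstract relies on, a consensus prediction and an RMSE restricted to a temperature interval, can be illustrated with a minimal sketch. It assumes that the consensus is a plain average of the individual models' predictions, which the abstract does not specify. The function names and the numeric values are hypothetical and are not taken from the article's data.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def consensus(predictions_per_model):
    """Average several models' predictions compound by compound.

    Assumption: a simple unweighted mean; the article may combine models differently.
    """
    return [sum(values) / len(values) for values in zip(*predictions_per_model)]


def rmse_in_interval(observed, predicted, low=50.0, high=250.0):
    """RMSE over compounds whose observed MP lies in [low, high] degrees Celsius."""
    pairs = [(o, p) for o, p in zip(observed, predicted) if low <= o <= high]
    return rmse([o for o, _ in pairs], [p for _, p in pairs])


# Made-up melting points (degrees Celsius), for illustration only.
observed = [100.0, 200.0, 300.0]
model_a = [110.0, 190.0, 250.0]  # hypothetical individual model
model_b = [90.0, 190.0, 310.0]   # hypothetical individual model

combined = consensus([model_a, model_b])
print("consensus predictions:", combined)
print("RMSE in [50, 250], model A:  ", rmse_in_interval(observed, model_a))
print("RMSE in [50, 250], model B:  ", rmse_in_interval(observed, model_b))
print("RMSE in [50, 250], consensus:", rmse_in_interval(observed, combined))
```

In this toy example the third compound melts at 300 °C and is therefore excluded from the interval RMSE. The averaged predictions happen to score better than either individual model, but this only illustrates the calculation and does not reproduce the article's results.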

Related articles

Calories and Water in Drug Design

Elucidation of how the thermodynamic parameters are determined by the molecular structures in bimolecular interactions is becoming a fundamental driving force in the rational design of drugs. If we can determine the structure of the target and the potential drug, we should be able to predict the equilibrium constant for their interaction based on simple relationships. With this information the...

Capturing the Crystal: Prediction of Enthalpy of Sublimation, Crystal Lattice Energy, and Melting Points of Organic Compounds

Accurate computational prediction of melting points and aqueous solubilities of organic compounds would be very useful but is notoriously difficult. Predicting the lattice energies of compounds is key to understanding and predicting their melting behavior and ultimately their solubility behavior. We report robust, predictive, quantitative structure-property relationship (QSPR) models for enthal...

The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS

BACKGROUND: Melting point (MP) is an important property with regard to the solubility of chemical compounds. Its prediction from chemical structure remains a highly challenging task for quantitative structure-activity relationship studies. Success in this area of research critically depends on the availability of high-quality MP data as well as accurate chemical structure representations in order...

Predicting Melting Points of Organic Molecules: Applications to Aqueous Solubility Prediction Using the General Solubility Equation.

In this work we make predictions of several important molecular properties of academic and industrial importance to seek answers to two questions: 1) Can we apply efficient machine learning techniques, using inexpensive descriptors, to predict melting points to a reasonable level of accuracy? 2) Can values of this level of accuracy be usefully applied to predicting aqueous solubility? We presen...

Prediction of melting points of a diverse chemical set using fuzzy regression tree

Classification and regression trees (CART) have the advantage of being able to handle large data sets and yield readily interpretable models. In spite of these advantages, they are also recognized as highly unstable classifiers with respect to minor perturbations in the training data. In other words, the methods present high variance. Fuzzy logic brings in an improvement in these aspects due ...

Journal title:

Volume 54, Issue

Pages -

Publication date: 2014